When microbes compete for limited resources, they often engage in chemical warfare using bacterial
toxins. This competition can be understood in terms of evolutionary game theory (EGT).
We study the predictions of EGT for the bacterial "suicide bomber" game in terms of the phase
portraits of population dynamics, for parameter combinations that cover all interesting games for
two players, and seven of the 38 possible phase portraits of the three-player game. We compare these
predictions to simulations of these competitions in finite well-mixed populations, but also allowing 
for probabilistic rather than pure strategies, as well as Darwinian adaptation over tens of thousands 
of generations. We find that Darwinian evolution of probabilistic strategies stabilizes games of the 
rock-paper-scissors type that emerge for parameters describing realistic bacterial populations, and 
point to ways in which the population fixed point can be selected by changing those parameters. 



PACS numbers: 87.23.Kg,87.10.Mn,87.18.Hf 

I. INTRODUCTION 

Evolutionary Game Theory (EGT) TUJl has become 
one of the pillars of evolutionary biology because it is 
a mathematically accessible framework that can account 
for the strategic aspect of frequency-dependent fitness, 
that is, if the fitness of a genotype depends on the fre- 
quency of other genotypes in the population. EGT is 
particularly useful when dealing with populations that 
include a mix of different strategies, and takes into ac- 
count the concept of "inclusive fitness" that encompasses 
how an organism contributes to the fitness of other ge- 
netically similar or even different types [IHS]. Conse- 
quently, EGT has been used to study the conditions for 
the emergence, maintenance, and evolution of coopera- 
tion as well as altruism [THS]. Of particular importance 
is the application of EGT to microbial communities (see,
e.g., [10]). The competitive interaction between microbes
(and even viruses [11,12]) is often best described within
the language of games [13-15], and they display cooperative
and other types of social behavior [16], cheating [17],
and even an extreme form of altruism where some community
members sacrifice themselves for the sake of others.
Another hallmark of microbial community dynamics
is the observation of cyclically competing species with
non-transitive relationships [18]. Of particular interest is
EGT's prediction of evolutionary fixed points for some 
dynamical interactions but not others, and characterization
of the stability of orbits within the phase space
of strategies. However, how these predictions compare
to realistic dynamics of adapting populations of cells, in 
particular those making stochastic decisions [19], is not 
always clear. Here, we study the predictions of EGT 
as applied to a simple bacterial system where different 
strategies of survival potentially coexist in the same pop- 
ulation, or where strategic decisions are made stochasti- 
cally rather than deterministically. 

Among the weapons that bacteria use to battle each 
other is the production of colicin, a bacterial toxin that 
is lethal to most strains of E. coli bacteria. Those strains 
that can produce the toxin are usually unharmed by it 
due to a resistance gene encoded on a plasmid within the 
cell. The individual that actually deploys the toxin, how- 
ever, pays the ultimate price as that cell literally explodes 
to distribute the toxin to as large a fraction of the popu- 
lation as possible. This ultimate act of altruism is benefi- 
cial to the "suicide bomber" 's kin because it frees up the 
resource that both sensitive and resistant types are using, 
for the sole benefit of the resistant types. This system 
has been studied experimentally [151 HO], and its dynamics
studied in terms of payoff matrices [21]. Some of the
dynamics we observe fall into the "Rock-Paper-Scissors"
(RPS) category of games, which have been studied an- 
alytically and experimentally [2 H [TOl [IHl I^^PS] (see 
also the review [3^). We study the EGT predictions of 
the equilibrium frequency of strategies in the population 
(when this equilibrium exists), how this equilibrium is 
modified when additional strategies can be produced via 
mutations, and how EGT predictions fare when popula- 
tion sizes are finite. To study these predictions we use 
different numerical simulation methods, and in particular
study strategies that evolve via mutation and selection in
a purely Darwinian manner [27]. Furthermore, because
the expression of the toxin is probabilistic in nature (ex- 
perimentally, only between 1% and 5% of the bacteria 
that carry the toxin plasmid actually express it [201 [H]), 
we study fully stochastic strategies with evolvable prob- 
abilities. In the following, we first introduce the notation 
and a discussion of equilibrium points in the well-known 
two-player game, setting the stage for a similar analysis 
of the three-player game. 



II. TWO-PLAYER SUICIDE BOMBER GAME 

Evolutionary game theory makes accurate predictions 
about the outcome of two-player games, whether the 
strategies are deterministic ('pure' strategies) or prob- 
abilistic ('mixed' strategies, where a player uses different 
pure strategies with different probabilities). The central
concept of EGT is the "evolutionarily stable strategy"
(ESS): If only one strategy is determined to be an ESS,
we should find this and only this strategy to be the winner
in a competition. In a game with two strategies $I$
and $J$, $I$ is an ESS if the payoff $E(I, I)$ when playing
itself is larger than the payoff $E(J, I)$ between any other
strategy $J$ and $I$, i.e., [1]

$$ I \text{ is ESS if } E(I, I) > E(J, I) . \qquad (1) $$

In case $E(I, I) = E(J, I)$, then $I$ is an ESS if the strategy
plays better against any other strategy $J$ than that other
strategy fares against itself:

$$ E(I, J) > E(J, J) \text{ when } E(I, I) = E(J, I) . \qquad (2) $$

In principle, a mixture of strategies can be an ESS. We
therefore introduce the population mixture described by the vector
$p = (p_I, p_J)$ where $p_I$ is the population fraction of
strategy $I$ and $p_I + p_J = 1$. Then $p$ is an ESS if (and
only if) for all $q$ (see e.g., [2])

$$ p \cdot E p \geq q \cdot E p , \qquad (3) $$

and at the same time

$$ \text{if } q \neq p \text{ and } p \cdot E p = q \cdot E p, \text{ then } p \cdot E q > q \cdot E q . \qquad (4) $$

Here, $E$ is the payoff matrix introduced above. Eqs. (3,4)
are just the population generalizations of (1,2). Condition
(3) defines the Nash equilibrium point of the population,
while (4) is the stability condition for the Nash
equilibrium. It turns out, however, that the concept of
an ESS is not as general as one might wish, because while
every ESS is a stable attractor of the population dynamics
(in the language of dynamical systems theory), not
every stable attractor is an ESS [28-30]. Thus, instead
of focusing on ESSs, in the following we study the fixed 
points and phase portraits of the dynamics that the pay- 
off matrix induces. 
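
As a minimal illustration of conditions (1) and (2), the following Python sketch tests whether a pure strategy is an ESS against every other pure strategy of a payoff matrix. The function name `is_pure_ess` and the index convention `E[i][j]` (payoff to strategy i when playing against strategy j) are assumptions made for this example, and the numerical matrices are illustrative only. The sketch does not test invasion by mixtures, for which conditions (3) and (4) are needed.

```python
def is_pure_ess(E, i):
    """Check conditions (1) and (2) for pure strategy i against all other pure strategies.

    E[i][j] is the payoff to strategy i when it plays against strategy j.
    """
    for j in range(len(E)):
        if j == i:
            continue
        if E[i][i] > E[j][i]:
            continue  # condition (1): i does strictly better against itself
        if E[i][i] == E[j][i] and E[i][j] > E[j][j]:
            continue  # condition (2): tie against i, but i beats j on j's own turf
        return False
    return True


# Illustrative matrices only (strategy 0 and strategy 1).
dilemma = [[1.0, 0.0], [1.3, 0.8]]       # strategy 1 invades strategy 0
coordination = [[1.0, 0.0], [0.8, 0.4]]  # each strategy resists the other

print(is_pure_ess(dilemma, 0), is_pure_ess(dilemma, 1))
print(is_pure_ess(coordination, 0), is_pure_ess(coordination, 1))
```

In the first matrix only strategy 1 passes the test, while in the second both pure strategies do, which is the situation in which the initial composition decides the winner.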

In the two-player "suicide bomber" game of colicin
warfare, the payoff between the wild-type (which we denote
here as strain '00') and the colicin-producing (but
resistant) type 'RT' is such that $E(00, 00) = 1$, but
$E(00, \mathrm{RT}) = 0$, that is, a wildtype strain is killed by
a strain expressing the toxin. On the other hand, RT
is itself inferior to the wildtype because it carries the
cost of that resistance, so $E(\mathrm{RT}, \mathrm{RT}) < 1$. Finally,
$E(\mathrm{RT}, 00) > 1$, expressing the advantage the colicin producer
has due to the suicidal behavior of its kin [21]. Note
that the cost of resistance is typically of the order of 15%
of wild-type fitness, but can be as small as 1% [31] or as
large as 60% [32]. The cost of producing the colicin (including
the reduction of growth rate by cell lysing) may
be even higher, depending on the frequency with which
colicin is being produced [32]. Our notation '00' and
'RT' for the two different types comes from the observa- 
tion that the production and the resistance of colicin in 
bacteria are usually encoded by two different genes ('R' 
for resistance, and 'T' for toxin), but most often within 
the same plasmid. 

We study a model in which two parameters govern the
interaction between types: the benefit $\epsilon > 0$ of expressing
the toxin and the cost $\omega > 0$, and vary these parameters
systematically. In principle, the two genes R and T could
each carry a different cost, but for most of the calculations
in this study we will keep them the same. The ESS
is determined by the following payoff matrix $E$ (rows and columns ordered 00, RT):

$$ E = \begin{pmatrix} 1 & 0 \\ 1 - 2\omega + \epsilon & 1 - 2\omega \end{pmatrix} . \qquad (5) $$



The advantage $\epsilon$ is a variable that depends on the spatial
structure of the environment and the distribution of types
in it, because suicide behavior will be more effective when
the resource that is freed up is more likely to be used by
the resistant kind. Thus, we study what strategy is an
ESS as a function of the parameters $\omega$ and $\epsilon$. In terms of
the more conventional notation of two-player games we
have $R = 1$, $S = 0$, $T = 1 - 2\omega + \epsilon$, and $P = 1 - 2\omega$, that
is, 00 is the cooperating type. Because payoff matrices
that differ only by a constant added to each column
induce the same dynamics, we can bring the matrix into
normal form so that the diagonal vanishes:

$$ E = \begin{pmatrix} 0 & 2\omega - 1 \\ \epsilon - 2\omega & 0 \end{pmatrix} . \qquad (6) $$
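
The step from matrix (5) to the normal form (6) can be checked numerically. The short Python sketch below subtracts from each column its diagonal entry; the function names `payoff_two_player` and `normalize_columns` are hypothetical helpers introduced for this illustration, and the parameter values are arbitrary.

```python
def payoff_two_player(eps, omega):
    """Payoff matrix (5); index 0 is type '00', index 1 is type 'RT'."""
    return [[1.0, 0.0],
            [1.0 - 2.0 * omega + eps, 1.0 - 2.0 * omega]]


def normalize_columns(E):
    """Subtract the diagonal entry of each column from that column."""
    n = len(E)
    return [[E[i][j] - E[j][j] for j in range(n)] for i in range(n)]


eps, omega = 1.5, 0.6  # arbitrary illustrative values
N = normalize_columns(payoff_two_player(eps, omega))
print(N)  # off-diagonal entries should equal 2*omega - 1 and eps - 2*omega
```

The off-diagonal entries of the result agree with $2\omega - 1$ and $\epsilon - 2\omega$ up to floating-point round-off.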



The fixed points, stability, and phase portrait of such
games have been solved (see, e.g, [Ml ED]). The dynamics
fall into four regions that harbor three different phase
portraits. In Region I, defined by $\omega < 1/2$, $\omega < \epsilon/2$, we
find the standard prisoner's dilemma, where the "defecting"
strategy is ESS as well as an attractor. We indicate
in Fig. 1 the parameter region for this dynamics,
along with a pictogram that describes the phase portrait
with the convention that solid circles represent attractors
and open circles denote repellers. Thus, in this language
the wild-type is the cooperator and the suicide
bomber the defector. For $\omega > 1/2$, $\omega > \epsilon/2$ (region
II) there is no dilemma and the wild-type is the ESS.
When $\omega > 1/2$, $\omega < \epsilon/2$ (region III), the game is an
"anti-coordination game" sometimes called "snowdrift"
or "Hawk-Dove", giving rise to a stable population mixture
of strategies as indicated by the fixed point along
the trajectory connecting the wild-type and the suicidal
type RT, with $p_{00} = (2\omega - 1)/(\epsilon - 1)$. If on the other hand $\omega < 1/2$



FIG. 1. Fixed points and flow for different two-player games
as a function of game parameters, for $\omega \in (0, 1)$ and $\epsilon \in (0, 2)$.
Region I: Prisoner's Dilemma, region II: "Harmony" game,
region III: Snowdrift game, region IV: "coordination" game
(see, e.g., [10]).

and $\omega > \epsilon/2$ (region IV), both strategies are ESSs
as well as attractors, and the ultimate winner depends
on which strategy is more abundant at the start of the
competition, as was observed in the Chao-Levin experiment
[20] in a well-mixed environment. Note that this
regime corresponds to a "coordination" game. If strategies
are probabilistic, equations (3) and (4) predict dominance
of a mixed strategy with the probability to engage
in '00' play given by the $p_{00}$ shown above ^j. The different
predicted phases are consistent with those observed
for competitions between sensitive and resistant types on
a lattice [33]. We can confirm the predictions of EGT by
numerical simulation of the population dynamics using
replicator equations [1,2]:

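$$ \dot{p}_{00}(t) = p_{00}(t) \left( w_{00} - \bar{w} \right) , \qquad (7) $$
$$ \bar{w} = p_{00}(t) \, w_{00} + (1 - p_{00}(t)) \, w_{RT} , \qquad (8) $$

with

$$ w_{00} = p_{00} E(00, 00) + (1 - p_{00}) E(00, \mathrm{RT}) , \qquad (9) $$
$$ w_{RT} = p_{00} E(\mathrm{RT}, 00) + (1 - p_{00}) E(\mathrm{RT}, \mathrm{RT}) . \qquad (10) $$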



We show the result of this simulation in Fig. 2(a), where
both strategies were initialized with $p = 0.5$. The replicator
equation simulations recapitulate the phase portraits
in Fig. 1.

FIG. 2. ESS as a function of the parameters $\epsilon$ and $\omega$. (a)
Numerical simulation of the population frequencies based on
replicator equations, with both types at equal frequency initially.
(b) Simulation of a finite well-mixed population of
agents that encode the strategies 00 and RT deterministically,
with the same initial conditions as in (a) [50 independent simulations
for each of 21 x 21 different parameter combinations].
In (a) and (b), red and green intensity indicates the frequency
of strategy 00 and RT in the population, respectively. The
separation between red and green in region IV is due to the
choice of initial condition, and does not represent a phase
boundary.
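
The replicator dynamics of Eqs. (7)-(10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for Fig. 2: the forward-Euler scheme, the step size `dt`, the number of steps, and the function names `payoff`, `region`, and `replicator_p00` are all choices made here for the example.

```python
def payoff(eps, omega):
    """Payoff matrix (5); index 0 is '00', index 1 is 'RT'."""
    return [[1.0, 0.0],
            [1.0 - 2.0 * omega + eps, 1.0 - 2.0 * omega]]


def region(eps, omega):
    """Label of the two-player region (points exactly on a boundary are not treated)."""
    if omega < 0.5:
        return "I" if omega < eps / 2.0 else "IV"
    return "III" if omega < eps / 2.0 else "II"


def replicator_p00(eps, omega, p00=0.5, dt=0.05, steps=20000):
    """Forward-Euler integration of Eq. (7); returns the final frequency of '00'."""
    E = payoff(eps, omega)
    for _ in range(steps):
        w00 = p00 * E[0][0] + (1.0 - p00) * E[0][1]  # Eq. (9)
        wrt = p00 * E[1][0] + (1.0 - p00) * E[1][1]  # Eq. (10)
        wbar = p00 * w00 + (1.0 - p00) * wrt         # Eq. (8)
        p00 += dt * p00 * (w00 - wbar)               # Euler step of Eq. (7)
        p00 = min(1.0, max(0.0, p00))                # guard against round-off
    return p00


# One illustrative parameter pair (eps, omega) per region.
for eps, omega in [(1.0, 0.2), (0.5, 0.8), (1.8, 0.6), (0.4, 0.3)]:
    print(region(eps, omega), round(replicator_p00(eps, omega), 4))
```

For the region III example, the final frequency can be compared with the mixed fixed point $(2\omega - 1)/(\epsilon - 1)$, which evaluates to 0.25 for $\epsilon = 1.8$, $\omega = 0.6$.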



We can also compare the EGT prediction to a simulation
of the evolution of agents that receive payoffs (5) in
a finite (but large) well-mixed population of 16,384 interacting
agents. Here, each player's strategy is determined
by a 'genome' with a single locus $p_{00}$ that stands for the
probability to engage in action 00. For each agent, we
randomly pick four opponents, and the aggregate score
against them is used as a fitness. We replace 2% of the
population at each update, using a death-birth Moran
process [34] (for more details on the simulations, see Section
III C). If we start the population with 50% genotypes
reflecting the wildtype ($p_{00} = 1$) and 50% expressing the
RT phenotype ($p_{00} = 0$) without any mutations so that
the pure strategies compete, we recapitulate the theoretical
results as well as the replicator equation simulations
(see Fig. 2(b)). If strategies can be mixed and we allow
mutations on $p_{00}$, we find $p_{00} = (2\omega - 1)/(\epsilon - 1)$ as expected (data
not shown). Indeed, it can be shown that for two-player
games, the predicted population fraction fixed points are
equal to the fixed points in the space of decision probabilities,
and that the stability of these fixed points also
coincides pTj. For the three-player game, the fixed points of
deterministic and stochastic strategies also coincide, but
the stability conditions do not.
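
The finite-population procedure described above (four random opponents, 2% replacement per update, death-birth order) can be sketched as follows for pure strategies without mutation. This is an illustration under several assumptions that are not taken from the source: a much smaller population, fitness recomputed at every update rather than accumulated, opponents drawn with replacement (so an agent may meet itself), a restriction to $\omega \le 1/2$ so that all payoffs are non-negative, and a tiny constant added to the selection weights to avoid an all-zero weight vector. The function name `simulate_two_player` is hypothetical.

```python
import random


def simulate_two_player(eps, omega, n=500, updates=400, frac=0.02, seed=0):
    """Death-birth update of pure strategies; returns the final fraction of '00'."""
    if omega > 0.5:
        raise ValueError("this sketch needs non-negative payoffs (omega <= 1/2)")
    rng = random.Random(seed)
    E = [[1.0, 0.0], [1.0 - 2.0 * omega + eps, 1.0 - 2.0 * omega]]  # matrix (5)
    pop = [0] * (n // 2) + [1] * (n - n // 2)  # 0 = '00', 1 = 'RT'
    n_replace = max(1, int(frac * n))
    for _ in range(updates):
        # Fitness: aggregate payoff against four randomly chosen opponents.
        fitness = [sum(E[s][pop[rng.randrange(n)]] for _ in range(4)) for s in pop]
        # Death first: remove a random 2% of the population.
        dead = set(rng.sample(range(n), n_replace))
        survivors = [i for i in range(n) if i not in dead]
        # Birth: parents are drawn from the survivors in proportion to fitness.
        weights = [fitness[i] + 1e-9 for i in survivors]
        parents = rng.choices(survivors, weights=weights, k=n_replace)
        for slot, parent in zip(dead, parents):
            pop[slot] = pop[parent]
    return pop.count(0) / n


# Region I example: the toxin producer is expected to take over.
print(simulate_two_player(eps=1.0, omega=0.2))
```

Because the weights are used only as relative selection probabilities, shifting all payoffs by a constant would be one way to extend the sketch to $\omega > 1/2$, at the price of changing the strength of selection.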



III. THREE-PLAYER GAME 

So far we studied the adaptive change of genotype frequencies
and expression probabilities, but we have not
addressed the consequences of macroscopic mutations.
While the toxin gene and the resistance gene are usually
tightly linked on the same plasmid [35], cells can acquire
resistance to the toxin without carrying the plasmid, for
example by changes to a receptor or the membrane protein
that imports the colicin. However, such changes are
usually costly because the same proteins are also involved
in importing nutrients into the cell.

To take mutations that create new strategies into account,
we introduce the additional type 'RO' (the type
'OT', which we simulate using Darwinian evolution below,
does not play a role here because it is never an ESS), and
study how the possibility of mutating into this type affects
the stability of the fixed points. In a general payoff
matrix for the suicide-bomber game, non-resistant cells
suffer an effect $A > 0$ from exposure to the colicin, and
carrying the toxin gene incurs a cost $\omega$ while resistance
decreases fitness by $\delta > 0$. As before, the advantage of
the RT phenotype is $\epsilon$. The payoff $E$ is now:



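$$ E = \begin{pmatrix} 1 & 1 & 1 - A \\ 1 - \delta & 1 - \delta & 1 - \delta \\ 1 - \delta - \omega + \epsilon & 1 - \delta - \omega & 1 - \delta - \omega \end{pmatrix} \qquad (11) $$

(rows and columns ordered 00, RO, RT).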



With these values, the resistant type RO is superior to
the resistant toxin producer RT but inferior to the wild-type
00 because of the cost of resistance. The dynamics
of these three strains can become non-transitive so that
all three strains can outcompete each other in a classic
rock-paper-scissors (RPS) dynamic [18]. We can study
the fixed points and phase portraits of this game theoretically,
as well as via agent-based simulations. For
simplicity, we take $A = 1$ as before, and restrict ourselves
to $\delta = \omega$ (cost of resistance equal to cost of toxin).
Furthermore, we can normalize the payoffs such that the
diagonal vanishes, so that the payoff matrix (rows and columns
ordered 00, RO, RT) becomes

$$ E = \begin{pmatrix} 0 & \omega & 2\omega - 1 \\ -\omega & 0 & \omega \\ \epsilon - 2\omega & -\omega & 0 \end{pmatrix} . \qquad (12) $$



A. Stability and Zeeman classes 

The population dynamics of three-player games has
been solved completely [30], and crucially depends on
the structure of fixed points in the interior or on the
boundary of the 2-simplex $\Delta$ defined by the probabilities
$(p_{00}, p_{RO}, p_{RT})$ with the constraint $p_{00} + p_{RO} + p_{RT} = 1$.
Using this 2-simplex, a simplified phase portrait of the
dynamics can be constructed using a notation depicting
attractors and repellers just as in Fig. 1. Zeeman showed
that the dynamics fall into ten combinatorial classes (up
to sign-reversal of each element of the payoff matrix),
depending on the number of fixed points in the interior
and on the boundary of the simplex. Each combinatorial
class itself may contain one or more stable classes giving
rise to 19 different "stable" phase portraits plus their
"inverses", where all flow directions are reversed and attractors
are replaced by repellers and vice versa. Here,
"stable" means that the phase portrait does not change
drastically within a neighborhood of the parameter values
defining the game [30]. The three-player suicide-bomber
game as defined by (12) displays seven of these
classes in seven regions, as shown in Fig. 3 and listed
in Table I.

FIG. 3. Phase portraits of stable dynamics for the three-player
game using payoffs (12). The shaded parameter region
$\omega < \epsilon/(1 + \epsilon)$ has an interior fixed point that can be repulsive
or attractive. In Zeeman's phase portrait pictograms [30],
arrows denote the flow on the boundary of the simplex, solid
circles are attractors and open circles are repellers. All fixed
points on the boundary and the interior are indicated. The
Zeeman classes for each region are indicated in Table I.

An interior fixed point only exists for the
parameter region $\omega < \epsilon/(1 + \epsilon)$, shown shaded in Fig. 3, so
that regions III and IV of the two-player game now contain
dynamics with an interior fixed point, and dynamics without.
In region II, the wild-type 00 is a stable attractor just
as in the two-player game. The only difference is that
there are now many different paths from strategy RT to
00 that include RO as an intermediate type. Region I
(corresponding to the Prisoner's Dilemma region in the
two-player game) has an interior fixed point (the Nash
equilibrium) for payoff matrix (12) [36]



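$$ p_{00} = \frac{\omega}{\epsilon} , \qquad (13) $$
$$ p_{RO} = 1 - \omega - \frac{\omega}{\epsilon} , \qquad (14) $$
$$ p_{RT} = \omega , \qquad (15) $$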


that is attractive when $\det E > 0$ (region Ia) and repulsive
for $\det E < 0$ [30] (region Ib). For our payoff matrix,
$\det E = \omega^2 (\epsilon - 1)$, so the boundary is given by the line
$\epsilon = 1$, as indicated in
Fig. 3. The dynamic in this region is a rock-paper-scissors
(RPS) game that is stable if $\epsilon > 1$, that is, the population
fractions approach the interior fixed point given by
Eqs. (13-15). For $\epsilon < 1$ the RPS game is unstable, and we
expect to observe heteroclinic orbits that ultimately pass
through the single-strategy fixed points. These oscillations
are similar to those observed by May and Leonard
in the Lotka-Volterra dynamics of three species [37].
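
The interior fixed point and the sign of the determinant can be verified directly from the normalized payoff matrix (12). The Python sketch below is an illustration only; the helper names (`payoff_three`, `interior_fixed_point`, `det3`, `payoffs_at`) and the chosen parameter values are assumptions of this example. At the fixed point all three types should receive the same payoff.

```python
def payoff_three(eps, omega):
    """Normalized payoff matrix (12); rows and columns ordered 00, RO, RT."""
    return [[0.0, omega, 2.0 * omega - 1.0],
            [-omega, 0.0, omega],
            [eps - 2.0 * omega, -omega, 0.0]]


def interior_fixed_point(eps, omega):
    """Eqs. (13)-(15); returns None if the point is not inside the simplex."""
    p = (omega / eps, 1.0 - omega - omega / eps, omega)
    return p if min(p) > 0.0 else None


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def payoffs_at(p, E):
    """Expected payoff of each pure type in a population with composition p."""
    return [sum(E[i][j] * p[j] for j in range(3)) for i in range(3)]


eps, omega = 1.25, 0.125  # illustrative values with eps > 1
E = payoff_three(eps, omega)
p = interior_fixed_point(eps, omega)
print(p, payoffs_at(p, E))           # the three payoffs coincide
print(det3(E), omega**2 * (eps - 1))  # determinant versus omega^2 (eps - 1)
```

Repeating the last line with $\epsilon < 1$ gives a negative determinant, which corresponds to the repulsive case.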
Region    Zeeman class
Ia        1
Ib        -1
II        2
IIIa      5_2
IIIb      5_1
IVa       -5_2
IVb       -5_1

TABLE I. Zeeman classes [30] for the seven regions with stable
dynamics. Note that regions IIIa and IVa, as well as IIIb and
IVb, are sign-opposites, that is, the matrix of signs of entries in
the normalized payoff matrix (up to permutations) is reversed.



The "snowdrift" region of the two-player game is now
divided into region IIIa with a stable fixed point that is a
mixture of RT and 00 only as in the two-player game, and
a region IIIb with a stable interior fixed point given by
Eqs. (13-15). The coordination region of the two-player
game (regions IVa and IVb of the three-player
game) shows just the inverse of the dynamics in regions
IIIa and IIIb, as outlined in Fig. 3. Thus, in region
IVb the RO-type is short-lived and the game reverts to
a two-player game with its attendant stability properties
(just as in region IIIa).



A simulation of strategy competition using the replicator
equations validates the phase portraits of the various
Zeeman classes, as seen in Fig. 4(a). For these simulations,
we used starting conditions where all three strategies
are equiprobable, and stopped after a finite number
of updates of the equations. As a consequence, a
particular strategy appears to be dominant for the unstable
RPS game (blue in Fig. 4(a), left) even though in
fact the trajectory cycles through the three pure strategies.
The expanding orbits of parameter combination A
in region Ib become shrinking orbits for region Ia (see
parameter combination B), but because the orbits still
touch the boundary, there is a chance of strategy extinction
in finite populations. Parameter combination C in
Fig. 4(a) is also in region Ia, but because the orbits are
tighter, extinction is unlikely even in finite populations.
If $\epsilon = 1$ in region I, the orbits are limit cycles encircling
the fixed point, but this dynamic is not "stable" in the
sense of Zeeman as infinitesimal changes in the payoffs
will change the trajectories qualitatively. Indeed, Zeeman
proved that there are no structurally stable limit
cycles in three-player games [30]. As $\epsilon$ passes through
the critical value $\epsilon = 1$, the flow exhibits a Hopf bifurcation [38].

Parameter combination D in Fig. 4(a) lies in region IIIb
which has an interior fixed point. This point lies close
to the boundary of region IIIa in which the fixed point
is on the edge (complete extinction of strategy RO). Indeed,
the equilibrium concentration of strategy RO for
combination D is $p_{RO} \approx 0.018$.
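
The approach to the interior fixed point for parameter combinations C and D can be reproduced with a direct integration of the three-type replicator equations. The Python sketch below is a minimal illustration and not the code behind Fig. 4: the fourth-order Runge-Kutta scheme, the step size, the integration length, and the function names are assumptions made here.

```python
def payoff_three(eps, omega):
    """Normalized payoff matrix (12); rows and columns ordered 00, RO, RT."""
    return [[0.0, omega, 2.0 * omega - 1.0],
            [-omega, 0.0, omega],
            [eps - 2.0 * omega, -omega, 0.0]]


def rhs(p, E):
    """Replicator right-hand side: p_i (f_i - mean payoff)."""
    f = [sum(E[i][j] * p[j] for j in range(3)) for i in range(3)]
    fbar = sum(p[i] * f[i] for i in range(3))
    return [p[i] * (f[i] - fbar) for i in range(3)]


def integrate(eps, omega, p0=(1 / 3, 1 / 3, 1 / 3), dt=0.1, steps=20000):
    """Classical RK4 integration starting from equal frequencies."""
    E = payoff_three(eps, omega)
    p = list(p0)
    for _ in range(steps):
        k1 = rhs(p, E)
        k2 = rhs([p[i] + 0.5 * dt * k1[i] for i in range(3)], E)
        k3 = rhs([p[i] + 0.5 * dt * k2[i] for i in range(3)], E)
        k4 = rhs([p[i] + dt * k3[i] for i in range(3)], E)
        p = [p[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(3)]
    return p


# Combinations C and D in the (eps, omega) convention of Fig. 4.
for label, (eps, omega) in {"C": (1.75, 0.375), "D": (1.75, 0.625)}.items():
    p = integrate(eps, omega)
    predicted = (omega / eps, 1.0 - omega - omega / eps, omega)  # Eqs. (13)-(15)
    print(label, [round(x, 4) for x in p], [round(x, 4) for x in predicted])
```

For combination D the RO component of the predicted fixed point is $1 - 0.625 - 0.625/1.75 \approx 0.018$, in agreement with the value quoted above.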



B. Finite populations 

If finite populations of pure strategy mixtures are simulated
using agent-based methods as described earlier for
the two-player game (population size 16,384), the dynamics
are unchanged from the infinite population-size limit
for the parameter region II. However, the heteroclinic orbits
that we observed in region Ib collapse when populations
are finite (as was noticed previously in Ref. [5S] for
a generic RPS game in region Ib), because both 00 and
RT go to extinction at the RO fixed point (see Fig. 4(b)) if
the initial population consists of each strategy in equal
concentration. Because of the nature of the flow, RO is
the surviving strategy for almost all initial conditions, reflecting
a principle of "survival-of-the-weakest" discussed
recently in the context of cyclic stochastic games [39]. Interestingly,
as we approach the boundary of region IVb,
fortunes are reversed: now RO is dispensable and the
game reverts to the two-player coordination game, where
either RT or 00 survives, depending on the initial condition.

Similar dynamics were also observed in experimental
populations of sensitive, resistant, and toxin-producing
bacteria engaged in an RPS game [18] as long as dispersal
was high, which corresponds to the well-mixed case
that we study here. However, loss of diversity (strategy
extinction) can also occur in region Ia where the fixed
point is stable, as long as the orbits have a high probability
of touching the boundary before spiraling in. This
dynamic is seen for point B in Fig. 4(b). For region III the
finite population size does not alter the phase portrait
appreciably: the fixed point in the interior is stable, and
even when one of the equilibrium concentrations is small
(as is the case for parameter combination D), the minority
strategy does not disappear for the population size
we studied. Of course, for sufficiently small populations,
RO has a chance of extinction even in region IIIb.



C. Stochastic strategies 

As mentioned, cellular strategies in biology are often
stochastic rather than pure. The decision to contribute
to the stalk or the spore in the evolutionary game that
Dictyostelium plays when resources become scarce is fully
stochastic, for example ($p \approx 0.5$ in the wild type [17]).
This type of stochasticity is different from simulating
errors of execution and perception [40], which focus on
small deviations from the deterministic scenario, and
have been used extensively in the economics literature
(see, e.g., [41]). Stochastic strategies are often called
"mixed" strategies in the literature, but we prefer not
to use this nomenclature because "mixed" often evokes
the idea of a mix of pure strategies (a polymorphic population).
The fixed points of populations that use stochastic
strategies have been studied less extensively than the
phase portraits of deterministic populations, but some
important results are known [42-46].

FIG. 4. (a) Left: Dominating strategy as determined by replicator equations as a function of cost $\omega$ and benefit $\epsilon$, where
the density of each pure strategy is indicated by the intensity of the three colors red (00), green (RT), and blue (RO), given
the starting condition where each strategy has 1/3 of the population. Regions as in Fig. 3. Right: population fraction
trajectories obtained from iterating the replicator equations for four parameter combinations $(\epsilon, \omega)$: A = (0.75, 0.125), B =
(1.25, 0.125), C = (1.75, 0.375) and D = (1.75, 0.625), whose locations are shown on the left. Arrows indicate the location of the
fixed points. (b) Dominating strategies in a finite population-size agent-based simulation of competition between pure strategies
in a well-mixed population (left, 20 runs of 5,000 updates per pair of parameters, for 21x21 parameter combinations). Right:
the trajectories for the four parameter combinations picked in (a) show stable coexistence for points C and D, but extinction of
strategies for A and B. (c) Left: Darwinian evolution of probabilistic strategies in an agent-based simulation at fixed mutation
rate (left, 20 replicates per pair of parameters, 21x21 parameter combinations), where colors indicate average fixed point
probabilities in the scheme of (a). The trajectories (right) represent the average probabilities to engage in the plays 00, RO,
and RT on an (averaged) line of descent, not trajectories of population frequencies. Arrows indicate the predicted fixed point.
Average of 50 replicate trajectories for 100,000 updates for each of the example parameters.

For example, define a stochastic strategy $S(q) = (q_{00}, q_{RO}, q_{RT})$ where the
$q_i$ are the probabilities for this individual to engage in the
plays $i$, and let its frequency in the population be $f(q)$.
Then the population mean strategy is

$$ \bar{S} = \sum_q f(q) \, S(q) , \qquad (16) $$

where we have assumed a discretization of strategy space
(these averages can be generalized to continuous strategy
spaces, see, e.g. [i^ H5]).

If $p$ is an ESS of the deterministic game defined by
payoff matrix $E$, then the population mean (stochastic)
strategy $\bar{S}$ is a locally stable equilibrium of the mean
strategy evolution, defined by

$$ \dot{f}(q) = f(q) \, (S(q) - \bar{S}) \cdot E \bar{S} , \qquad (17) $$

if and only if

$$ \bar{S} = p . \qquad (18) $$

In other words, the fixed points of the deterministic game
are the fixed points (in the sense that the population is
"fixed" at $\bar{S}$) of the stochastic game [33]. As a consequence,
if $p$ is an ESS, then so is $\bar{S}$. However, nothing
is known (to our knowledge) about the stability of fixed
points of the stochastic game that are not attractors for
the deterministic game. At the same time, even when the
population is fixed at $\bar{S}$, the population composition can
change neutrally as long as the average $\bar{S}$ is preserved.
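
The population mean strategy of Eq. (16), and the observation that different population compositions can share the same mean, can be illustrated with a few lines of Python. The function name `mean_strategy` and the example frequencies are assumptions of this sketch.

```python
def mean_strategy(strategies, freqs):
    """Eq. (16): frequency-weighted mean of strategy vectors (q00, qRO, qRT)."""
    total = float(sum(freqs))
    dim = len(strategies[0])
    return tuple(sum(f * s[k] for s, f in zip(strategies, freqs)) / total
                 for k in range(dim))


# A polymorphic population of three pure strategies ...
polymorphic = mean_strategy([(1, 0, 0), (0, 1, 0), (0, 0, 1)], [0.2, 0.3, 0.5])
# ... and a monomorphic population of a single stochastic strategy.
monomorphic = mean_strategy([(0.2, 0.3, 0.5)], [1.0])
print(polymorphic, monomorphic)
```

Both populations have the same mean strategy, so a description in terms of the mean alone cannot distinguish them.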

To simulate the Darwinian evolution of probabilistic
strategies we introduce separate loci for the 'R' and 'T'
gene, encoding the probabilities $q_R$ and $q_T$. Thus, a
'00' phenotype is expressed with probability $q_{00} = (1 - q_R)(1 - q_T)$,
while the RO phenotype has $q_{RO} = q_R (1 - q_T)$
and so forth. While we therefore also simulate the possibility
of a 'OT' phenotype, it never plays a dominant role
in the dynamics as expected. As in our simulation of the
evolution of the two-player game, strategies play four random
other players in the population to determine the fitness
of the genotype $(q_R, q_T)$, which determines the probability
with which this strategy leaves offspring in the next
generation. The probabilities $q_R$, $q_T$ can be viewed as the
stochastic decisions encoded by two different (and independent)
genetic pathways with many genes, but rather
than simulating how the genetic networks encode these
probabilities, we evolve them directly.
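
The mapping from the two-locus genotype $(q_R, q_T)$ to phenotype probabilities can be written out explicitly. The Python sketch below assumes, as stated above, that the two loci are expressed independently, so that the remaining two probabilities follow the same product rule; the function names and the label dictionary are hypothetical.

```python
import random

# Phenotype labels indexed by (resistance expressed, toxin expressed).
LABELS = {(False, False): "00", (True, False): "RO",
          (False, True): "OT", (True, True): "RT"}


def phenotype_probabilities(q_r, q_t):
    """Probability of each phenotype for independent loci q_R and q_T."""
    return {"00": (1.0 - q_r) * (1.0 - q_t),
            "RO": q_r * (1.0 - q_t),
            "OT": (1.0 - q_r) * q_t,
            "RT": q_r * q_t}


def sample_phenotype(q_r, q_t, rng):
    """Draw one play by expressing each gene independently."""
    return LABELS[(rng.random() < q_r, rng.random() < q_t)]


rng = random.Random(1)
print(phenotype_probabilities(0.5, 0.5))  # seed genotype: all four equally likely
print([sample_phenotype(0.9, 0.05, rng) for _ in range(5)])
```

With the seed genotype $q_R = q_T = 0.5$ each of the four phenotypes has probability 1/4.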

As before, we evolve the strategies using a death-birth 
Moran process, by removing at every update a random 
2% of a population of 1,024 genotypes in a well-mixed 
population, and replacing them with a set of genotypes 
that were picked (probabilistically, according to fitness) 
from the set of players that survived the 2% removal (we 
use a death-birth process rather than the more common 
birth-death process to avoid the awkward case that a 
genotype is born from a deleted type) . When a genotype 
is not replaced, fitness accumulates from playing more 
games, but this does not result in a skewed age distribu- 
tion as removal is random. 

To determine the mean "fixed point" strategy for each
of 21 x 21 parameter configurations (Fig. 4(c), leftmost
panel), we first reconstruct the line of descent (LOD) of
each of 20 replicate populations evolved for 500,000 updates,
by picking the dominating strategy genotype and
following its ancestral line all the way back to the seed genotype
(a strategy with $q_R = q_T = 0.5$). The line of descent
recapitulates the evolutionary history of that particular
experiment, as it contains the sequence of mutations that
gave rise to the successful strategy at the end of the run
(see, e.g., [13]). The LOD also defines a trajectory in
strategy space, but rather than being a smooth curve the
LOD is jagged and jumps between probabilities. When
averaging the LODs across runs to obtain the fixed point,
we first discard the last 50,000 updates for each (in order
to make sure that our LOD has coalesced) and then
average the genotypes of the LOD after the first 250,000
updates, in order to ensure that the trajectory settled on
the fixed point (the procedure is described in more detail
in [27]).
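
The line-of-descent reconstruction amounts to following parent links backwards from the final genotype. The Python sketch below is an illustration under assumptions that are not taken from the source: the bookkeeping dictionaries `parent`, `genotype`, and `birth_update` are hypothetical, and the unweighted average over LOD members born inside the retained window is one simple choice (the procedure actually used is described in [27]).

```python
def line_of_descent(final_id, parent):
    """Return organism ids from the seed ancestor down to final_id."""
    lod = []
    node = final_id
    while node is not None:
        lod.append(node)
        node = parent[node]  # the seed ancestor has parent None
    lod.reverse()
    return lod


def mean_genotype_on_lod(lod, genotype, birth_update, start, stop):
    """Unweighted mean (q_R, q_T) of LOD members born in the window [start, stop)."""
    window = [genotype[i] for i in lod if start <= birth_update[i] < stop]
    if not window:
        return None
    n = len(window)
    return (sum(g[0] for g in window) / n, sum(g[1] for g in window) / n)


# Toy genealogy: organism 3 is a side branch that left no descendants.
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 2}
genotype = {0: (0.5, 0.5), 1: (0.6, 0.4), 2: (0.7, 0.3), 3: (0.1, 0.1), 4: (0.9, 0.1)}
birth_update = {0: 0, 1: 10, 2: 20, 3: 25, 4: 30}

lod = line_of_descent(4, parent)
print(lod)
print(mean_genotype_on_lod(lod, genotype, birth_update, start=10, stop=30))
```

In the setting described above, the window would run from update 250,000 to update 450,000 of a 500,000-update run.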

To obtain the strategy trajectory $\bar{S}$ for the four parameter
combinations A-D used earlier, we collect the plays
of each agent in the population at each update (instead of
collecting the frequencies $f(q)$ of each strategy $S(q)$), as
the plays faithfully recapitulate the genotype in a well-mixed
population [27]. The mean population strategy
is plotted for the first 100,000 updates of an average of
50 replicate experiments in Fig. 4(c) (right four panels).
As in previous work studying the evolution of stochastic
strategies in the iterated Prisoner's Dilemma [27], the
population strategy moves to an evolutionary fixed point.
In the present case (a non-iterated game), the fixed point
is at (or close to) the Nash equilibrium of the deterministic
game. While the fixed points B-D are attractors (but
not ESSs), fixed point A is repulsive in the deterministic
game. For the stochastic
game, however, this fixed point turns out to be an
attractor.

In hindsight, this is not surprising when the competi- 
tion is viewed in terms of an adaptive dynamics formal- 
ism (see, e.g., [3H1I1S]) as the population does not engage 
in a competition between three strategies that exclude 
each other. Rather, the population is fairly monomor- 
phic, centered around the single stochastic strategy with 
probabilities given by the Nash fixed point. Thus, rather 
than witnessing positive frequency-dependent selection 
in a competition of three types, we see stable mutation- 
selection balance of a single type. 

We note that the diversity of plays that we observe 
appears to contradict in part previous conclusions that 
local dispersal promotes diversity in microbial dynam- 
ics of the RPS-type [18]. These authors concluded that 
spatial interaction environments (nearest-neighbor as op- 
posed to well-mixed) promote diversity, observing coexis- 
tence when a well-mixed population simulation predicted 
extinction of two out of the three strategies, for parameter 
combinations that we estimate put their population 
in region Ib or IVb, with a repelling fixed point. Yet, 
when phenotypes are expressed probabilistically, strate- 
gies are stabilized. However, while we know that at least 
the decision to express the 'T' type is stochastic in mi- 
crobial colicin phenotypes, the expression of the 'R' type 
may be deterministic. Thus, our simulations cannot be 
directly compared to the experiments of Ref. [18]. We 
should also keep in mind that we have not explicitly taken 
into account the effect of spatial structure [33] here, but 
instead simply varied the payoff ε. However, the loss of 
diversity was predicted to occur in the well-mixed mode, 
which we show to be stable instead when decisions are 
probabilistic. 



CONCLUSIONS 

We have studied the fixed points and phase portraits 
of populations engaged in game-theoretical dynamics in- 
spired by the microbial "suicide-bomber" game. Depend- 
ing on the physical parameter values that determine the 
payoffs of different decisions, all the well-known games of 
the two-player scenario are covered. When extending to 
a three-player game by decoupling resistance from toxin 
production, we study the fixed points and stability not of 
all possible three-player games, but a subset of seven of 
38 possible phase portraits. When strategies are allowed 
to be stochastic rather than deterministic, we observe 
that the population mean strategy moves toward the de- 
terministic strategy fixed point, but repelling fixed points 
become attractive when decisions are probabilistic. We 
stress again that the original ESS concept is lacking for 
the three-player game, as none of the fixed points (even 
the attractive ones) are ESSs in the sense of Maynard 
Smith [J. 

It is difficult to ascertain which dynamics we should 
expect in natural populations, as the cost of production 
of the colicin as well as the cost of resistance vary 
considerably [31, 32]. In general, we should expect that the 
costs are different, and that the efficiency of killing due 
to bacteriocins is not 100%, so that the more general 
model (11) should be used. But given these caveats, 
our predictions of evolutionary fixed points as a func- 
tion of cost and benefit parameters should be testable by 
dedicated experiments of the sort described in [18], by 
changing the parameters of the evolutionary game (for 
example by experimentally varying the cost of resistance 
or benefit of the toxin). Recently, Nahum et al. [32] have 
studied the evolution by natural selection of the param- 
eters that define the game, and found that the resistant 
type (here, type RO) tended to evolve towards more 
restrained interactions (higher δ) compared to populations 
that evolved in the absence of interactions, at least in the 
case that migration was restricted so that spatial effects 
are present. This appears to be a direct verification of the 
"survival-of-the-weakest" principle [311 ISO] that is appli- 
cable in environments where finite population size intro- 
duces stochasticity, in parameter region Ib of the three- 
player game. Clearly it will be interesting to see more 
general evolutionary trajectories within the (ω, ε)-space (or 
the more general (δ, ω, ε)-space), under different realistic 
constraints for spatial interactions, as well as costs of re- 
sistance and toxin production that are constrained to lie 
within biologically reasonable assumptions. It has pre- 
viously been shown that changing environmental condi- 
tions can change the dynamics from a Prisoner's Dilemma 
to a snowdrift game (moving from region I into region 
III in Fig. 1) in the two-player game [TH [5T]. It would 
be interesting to test whether populations can be coaxed 
to move from one Zeeman class to another in the three- 
player game, simply by changing the selective pressures 
acting on the system. 